Upgrade llama.cpp to b9682 and improve CI test diagnostics #239
Merged
…stics sync)

Insert -Xmx2g into the surefire argLine (the repo already had the -XX crash/heap-dump flags and the memory before/after CI steps); add -e to the Java test mvn invocations. Implements workspace/policies/ci-test-diagnostics.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM
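One way to confirm that the `-Xmx2g` setting actually reaches the forked test JVM is to print the heap ceiling from inside a test run. The following is a minimal sketch only: `HeapReport` and its methods are hypothetical names and are not part of this repository.

```java
public final class HeapReport {
    private static final long MIB = 1024L * 1024L;

    private HeapReport() {}

    /** Upper bound on the heap the JVM will try to use, in MiB (driven by -Xmx). */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently in use (allocated minus free), in MiB. */
    public static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    /** One-line summary suitable for a CI log. */
    public static String describe() {
        return "heap used=" + usedHeapMiB() + " MiB, max=" + maxHeapMiB() + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

With `-Xmx2g` in effect, the reported maximum should be close to 2048 MiB; the exact figure can vary with the garbage collector in use.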
Clean version bump: no project C++ source changes required. The upstream changes in this range are all internal to upstream-compiled translation units, and src/main/cpp references none of the touched symbols (grep for mtmd/speculative/draft/process_chunk/build_lora_mm_id returns zero matches):

- process_chunk removed, folded into mtmd_helper_eval_chunk_single
- mtmd_helper_decode_image_chunk gained a post-decode callback + user_data
- build_lora_mm_id gained a w_s scale-weight argument
- speculative decoding: per-position acceptance stats + Eagle3 backend sampling
- server-context refactor lets an mtmd prompt feed a speculative draft model

Verified: mvn compile (JNI headers) and cmake configure against b9682 both succeed. Documented in docs/history/llama-cpp-breaking-changes.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM
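The grep check described in this commit could also be scripted so that it is repeatable on future upgrades. The sketch below is an illustration, not the check that was actually run: the class `SymbolScan` and its method are hypothetical, and the symbol list and the `src/main/cpp` default are taken from the commit message above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public final class SymbolScan {
    private SymbolScan() {}

    /** Counts lines in regular files under root that contain any of the given substrings. */
    public static long countMatchingLines(Path root, List<String> symbols) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        long total = 0;
        for (Path file : files) {
            // ISO-8859-1 maps every byte to a character, so odd encodings cannot abort the scan.
            for (String line : Files.readAllLines(file, StandardCharsets.ISO_8859_1)) {
                if (symbols.stream().anyMatch(line::contains)) {
                    total++;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "src/main/cpp");
        List<String> symbols = Arrays.asList(
                "mtmd", "speculative", "draft", "process_chunk", "build_lora_mm_id");
        System.out.println("matching lines: " + countMatchingLines(root, symbols));
    }
}
```

A result of zero corresponds to the "zero matches" outcome reported above; any non-zero count would mean the bindings reference a touched symbol and need review.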
….11.0

Mirrors the streambuffer Dependabot updates (#108, #109) on the java-llama.cpp branch. Both target versions are the current latest releases on Maven Central.

Verified:
- spotless:check passes with 3.7.0 (no reformatting of existing sources; palantir-java-format stays pinned at 2.92.0)
- central-publishing 0.11.0 resolves from Maven Central (used only by the release/deploy profile)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM
bernardladenthin pushed a commit that referenced this pull request on Jun 18, 2026
The Win32 (x86) C++ test job intermittently failed at build-time gtest_discover_tests. llama/ggml/mtmd are linked statically into one large jllama_test binary; on 32-bit Windows its startup plus --gtest_list_tests enumeration sits near the default 5s discovery timeout on shared CI runners.

The same b9682 binary was discovered within 5s in the #239 merge run but was killed at the 5s timeout in this run (process still alive, empty output: a timeout, not a crash); the b9682 upgrade and 5 newly added tests nudged a marginal case over the limit. x64, Linux and macOS finish well under the default and are unaffected.

Raise DISCOVERY_TIMEOUT to 120s (a maximum, so fast platforms still return immediately), which keeps full C++ test coverage on x86 rather than skipping the binary there.

Verified locally: 445/445 C++ tests still pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014L2dLbAtwdq7C6a2gFRsQQ
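The point that a raised timeout is only an upper bound, not a fixed wait, can be shown with a small Java analogy. This is not how CMake implements DISCOVERY_TIMEOUT; `BoundedWait` and `runWithLimit` are hypothetical names used only for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public final class BoundedWait {
    private BoundedWait() {}

    /**
     * Runs the task and waits at most maxSeconds for it to finish.
     * Returns the elapsed milliseconds, or -1 if the limit was reached first.
     */
    public static long runWithLimit(Callable<?> task, long maxSeconds) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        long start = System.nanoTime();
        try {
            Future<?> result = pool.submit(task);
            // get() returns as soon as the task completes; the limit only caps the wait.
            result.get(maxSeconds, TimeUnit.SECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (TimeoutException e) {
            return -1;
        } finally {
            // Interrupt the task if it is still running so the worker thread can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        long elapsed = runWithLimit(() -> "tests listed", 120);
        System.out.println("fast task finished after " + elapsed + " ms despite a 120 s limit");
    }
}
```

A task that finishes quickly returns quickly whatever the limit is, which is why a generous ceiling costs nothing on x64, Linux and macOS while giving the slow x86 runner room to finish.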

Summary
- Add the -e flag to all Maven test invocations in CI workflows for enhanced error diagnostics
Test plan
Related issues / PRs
None
Checklist
CONTRIBUTING.md and CODE_OF_CONDUCT.md
SECURITY.md)
Details
llama.cpp upgrade (b9642 → b9682):
The upstream release includes new multimodal speculative-draft decoding capabilities (post-decode callbacks in mtmd_helper_decode_image_chunk, per-position draft-acceptance statistics in speculative decoding, and Eagle3 backend sampling). These are internal to the upstream-compiled libraries and the server slot state machine; no changes to jllama's C++ bindings or Java API are required. All changes are documented in docs/history/llama-cpp-breaking-changes.md.
CI improvements:
- Add the -e flag to Maven invocations across all test jobs (jcstress, Linux, macOS, Windows) so that full error output and stack traces are captured when a test fails
- Raise the surefire argLine heap limit from the default to -Xmx2g to prevent out-of-memory errors during test execution
Dependency updates:
https://claude.ai/code/session_01JBzF5wtCjRu5t4FMzphryM